AITopics | interpretable model

Collaborating Authors

interpretable model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Generalized Functional ANOVA in Closed-Form: A Unified View of Additive Explanations

Ferrere, Baptiste, Bousquet, Nicolas, Gamboa, Fabrice, Loubes, Jean-Michel

arXiv.org Machine LearningMay-19-2026

The functional ANOVA, or Hoeffding decomposition, provides a principled framework for interpretability by decomposing a model prediction into main effects and higher-order interactions. For independent inputs, this classical decomposition is explicit. It is closely connected to SHAP values, generalized additive models, and orthogonal polynomial expansions, and therefore constitutes a fundamental tool for additive explainability. In the more general and realistic dependent setting, however, obtaining a tractable representation and estimating the decomposition from data remain challenging. In this work, we address this problem for continuous inputs. By combining Hilbert space methods with the generalized functional ANOVA, we build an explicit decomposition Riesz Basis allowing to easily compute the decomposition. Our formulation recovers the classical independent case and its associated orthogonal decomposition. Building on this representation, we propose a simple but mighty algorithm to estimate the decomposition from a data sample in a model-agnostic setting and we compare it empirically with several state-of-the-art explanation methods, demonstrating the power of the approach.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

2605.18422

Genre: Research Report > New Finding (0.46)

Industry: Energy > Power Industry (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Add feedback

Gender: FemaleAge: YoungHair Color: BlondeSkin: WhiteEmotion: SeriousBeard: NoMakeup: No

Neural Information Processing SystemsApr-24-2026, 21:33:11 GMT

Machine learning models can frequently produce systematic errors on critical subsets (or slices) of data that share common attributes. Discovering and explaining such model bugs is crucial for reliable model deployment. However, existing bug discovery and interpretation methods usually involve heavy human intervention and annotation, which can be cumbersome and have low bug coverage. In this paper, we propose HiBug, an automated framework for interpretable model debugging. Our approach utilizes large pre-trained models, such as chatGPT, to suggest human-understandable attributes that are related to the targeted computer vision tasks. By leveraging pre-trained vision-language models, we can efficiently identify common visual attributes of underperforming data slices using humanunderstandable terms. This enables us to uncover rare cases in the training data, identify spurious correlations in the model, and use the interpretable debug results to select or generate new training data for model improvement. Experimental results demonstrate the efficacy of the HiBug framework. Code is available at: https://github.com/cure-lab/HiBug.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Asia > China (0.46)
North America (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

72d50a87b218d84c175d16f4557f7e12-Paper-Conference.pdf

Neural Information Processing SystemsFeb-15-2026, 20:29:43 GMT

explanation, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Saarland > Saarbrücken (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

ExpO

Gregory Plumb

Neural Information Processing SystemsFeb-9-2026, 00:18:04 GMT

agent, explanation, local explanation, (15 more...)

Neural Information Processing Systems

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

HiBug: On Human-Interpretable Model Debug Muxi Chen, Yu Li, Qiang Xu The Chinese University of Hong Kong

Neural Information Processing SystemsFeb-7-2026, 21:45:05 GMT

Experimental results demonstrate the efficacy of the HiBug framework.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Asia > China > Hong Kong (0.40)
North America > United States (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)

Add feedback

Discriminative Feature Attributions: Bridging Post Hoc Explainability and Inherent Interpretability

Neural Information Processing SystemsDec-26-2025, 07:22:05 GMT

With the increased deployment of machine learning models in various real-world applications, researchers and practitioners alike have emphasized the need for explanations of model behaviour. To this end, two broad strategies have been outlined in prior literature to explain models. Post hoc explanation methods explain the behaviour of complex black-box models by identifying features critical to model predictions; however, prior work has shown that these explanations may not be faithful, in that they incorrectly attribute high importance to features that are unimportant or non-discriminative for the underlying task. Inherently interpretable models, on the other hand, circumvent these issues by explicitly encoding explanations into model architecture, meaning their explanations are naturally faithful, but they often exhibit poor predictive performance due to their limited expressive power. In this work, we identify a key reason for the lack of faithfulness of feature attributions: the lack of robustness of the underlying black-box models, especially the erasure of unimportant distractor features in the input. To address this issue, we propose Distractor Erasure Tuning (DiET), a method that adapts black-box models to be robust to distractor erasure, thus providing discriminative and faithful attributions. This strategy naturally combines the ease-of-use of post hoc explanations with the faithfulness of inherently interpretable models. We perform extensive experiments on semi-synthetic and real-world datasets, and show that DiET produces models that (1) closely approximate the original black-box models they are intended to explain, and (2) yield explanations that match approximate ground truths available by construction.

bridging post hoc explainability, discriminative feature attribution, explanation, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning (0.75)

Add feedback

Washing The Unwashable : On The (Im)possibility of Fairwashing Detection

Neural Information Processing SystemsDec-24-2025, 06:47:43 GMT

The use of black-box models (e.g., deep neural networks) in high-stakes decision-making systems, whose internal logic is complex, raises the need for providing explanations about their decisions. Model explanation techniques mitigate this problem by generating an interpretable and high-fidelity surrogate model (e.g., a logistic regressor or decision tree) to explain the logic of black-box models. In this work, we investigate the issue of fairwashing, in which model explanation techniques are manipulated to rationalize decisions taken by an unfair black-box model using deceptive surrogate models. More precisely, we theoretically characterize and analyze fairwashing, proving that this phenomenon is difficult to avoid due to an irreducible factor---the unfairness of the black-box model. Based on the theory developed, we propose a novel technique, called FRAUD-Detect (FaiRness AUDit Detection), to detect fairwashed models by measuring a divergence over subpopulation-wise fidelity measures of the interpretable model. We empirically demonstrate that this divergence is significantly larger in purposefully fairwashed interpretable models than in honest ones. Furthermore, we show that our detector is robust to an informed adversary trying to bypass our detector.

black-box model, name change, unwashable, (9 more...)

Neural Information Processing Systems

Technology: